Sequencing and Raw Sequence Data Quality Control ◾ 39
do not meet the criteria entirely, this program cuts only the bases whose quality scores are
below the specified threshold from the ends of the reads.
fastq_quality_trimmer \
-i bad_filt.fastq \
-t 28 \
-o bad_filt_trim.fastq \
-Q33
fastqc bad_filt_trim.fastq
htmlfiles=$(ls *.html)
firefox $htmlfiles
The “-t” option specifies the quality threshold: bases at the ends of a read whose quality
scores fall below this value are trimmed. After trimming, the resulting reads may be of
unequal lengths, which some programs used in subsequent analysis steps may not accept.
As shown in Figure 1.33, although trimming improved the per base sequence quality, it
also raised a sequence length distribution warning because the trimmed reads are of
unequal lengths. We may need to filter reads by length.
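One way to inspect and filter read lengths after trimming is sketched below with awk. This is only an illustration, not part of the FASTX-Toolkit: the function names “fastq_length_counts” and “fastq_min_length” are hypothetical, and the sketch assumes plain four-line FASTQ records (no wrapped sequence lines). The trimmer itself also lists a “-l” minimum-length option in its help output, which is worth checking for the installed version before relying on it.

```shell
# Hypothetical helper: tabulate read lengths as "length<TAB>count",
# sorted by length. Assumes four-line FASTQ records.
fastq_length_counts() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (k in n) print k "\t" n[k] }' "$1" | sort -n
}

# Hypothetical helper: print only the records whose read is at least
# min_len bases long. Assumes four-line FASTQ records.
fastq_min_length() {
    min_len=$1
    infile=$2
    awk -v min="$min_len" '
        NR % 4 == 1 { header = $0 }   # @read_id line
        NR % 4 == 2 { seq = $0 }      # bases
        NR % 4 == 3 { plus = $0 }     # separator line
        NR % 4 == 0 {                 # quality string completes the record
            if (length(seq) >= min)
                print header "\n" seq "\n" plus "\n" $0
        }
    ' "$infile"
}
```

For example, running fastq_length_counts on bad_filt_trim.fastq shows how many reads remain at each length, and fastq_min_length 50 bad_filt_trim.fastq, redirected to a new file, keeps only reads of at least 50 bases. The cutoff of 50 is an arbitrary illustration; a suitable value depends on the original read length and on the requirements of the downstream programs.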
FIGURE 1.33 The QC report of the filtered and trimmed “bad.fastq” file.